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ABSTRACT 

Existing approaches to compressive sensing of frequency- 
sparse signals focuses on signal recovery rather than spectral 
estimation. Furthermore, the recovery performance is limited 
by the coherence of the required sparsity dictionaries and 
by the discretization of the frequency parameter space. In 
this paper, we introduce a greedy recovery algorithm that 
leverages a band-exclusion function and a polar interpolation 
function to address these two issues in spectral compressive 
sensing. Our algorithm is geared towards line spectral es- 
timation from compressive measurements and outperforms 
most existing approaches in fidelity and tolerance to noise. 

Index Terms — Compressive sensing, frequency-sparse 
signals, spectral estimation, polar interpolation 

1. INTRODUCTION 

One of the most popular thrusts in compressive sensing (CS) 
research has focused on the recovery of signals that are spec- 
trally sparse (i.e., that have a sparse frequency-domain rep- 
resentation) from a reduced number of measurements [1-5]. 
Such frequency-sparse signals bring up a novel issue in the 
formulation of the CS recovery problem: frequency-domain 
representations have a continuous parameter space, while CS 
is inherently rooted on discretized signal representations. 

Aiming for an increasingly dense sampling of the fre- 
quency parameter space introduces performance issues in 
sparsity-leveraging algorithms. In particular, increasing the 
resolution of the parameter sampling worsens the coherence 
of the dictionary that provides sparsity for relevant signals. 
This both prevents certain algorithms from finding the sparse 
representation successfully and introduces ambiguity on the 
choice of representations available for a signal in the dictio- 
nary. Initial contributions address such issues by modifying 
the sparsity prior, the recovery algorithm, or both, to be 
tailored to the intricacies of the signal representation [5-8]. 

Interestingly, CS recovery of frequency-sparse signals can 
be formalized in two different ways: recovery of the signal 
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samples, and recovery of the signal's component frequencies. 
Previous contributions have almost exclusively focused on the 
former; their performance for the latter goal is limited by the 
representation leveraged during CS. Particularly, the required 
discretization of the parameter space explicitly limits the per- 
formance of compressive frequency estimation. 

In this paper, we improve over existing approaches by in- 
troducing interpolation steps within CS recovery algorithms 
that break the discretization barrier implicit in CS and are 
able to improve the quality of frequency parameter estima- 
tion. While such interpolation is considered briefly and inte- 
grated to a simple recovery algorithm in [5], we introduce a 
novel polar interpolation approach that leverages the fact that 
frequency-sparse signals are translation-invariant in the fre- 
quency domain. We couple polar interpolation with a more 
sophisticated CS greedy recovery approach to improve the 
performance of spectral CS over existing algorithms. We pro- 
vide experimental evidence that shows improved frequency 
estimation performance against approaches previously pro- 
posed for spectral CS signal recovery: in some cases, our 
estimates are more precise than those from the baseline ap- 
proaches, while in other cases we match the precision of the 
baseline with greatly reduced computational complexity. 

2. BACKGROUND AND RELATED WORK 

Compressive sensing (CS) is a technique to simultaneously 
acquire and reduce the dimensionality of sparse signals in a 
randomized fashion. More precisely, in the CS framework, a 
signal f G C N is sampled by M linear measurements of the 
form y = Af, where A is an M x N sensing matrix and 
M <C N. In practice, the measurements are acquired in the 
presence of noise z, in which case we have y = Af + z. 

In many applications, the signal f is not sparse but has a 
sparse representation in some dictionary D. In other words, 
we have f = Dx, where x is A"-sparse (i.e. ||x||o < K). 
Under certain conditions on the matrix A [9, 10], we can re- 
cover x from the measurements y through the following £±- 
minimization problem (which we refer to as £i-synthesis): 

x = min ||x||i s.t. ||ADx - y|| 2 < e, (1) 

xGC™ 

where e is an upper bound on the noise level | |z| |2- Note that 



optimal recovery of x from the optimization in (1) is feasible 
only when the elements of the dictionary D form an orthonor- 
mal basis, and thus are incoherent [1,11]. However, in many 
applications, the signal of interest is sparse in an overcom- 
plete dictionary or a frame, rather than in a basis. 

This paper focuses on frequency-sparse signals, which 
can be modeled as a superposition of K complex sinusoids 
with arbitrary frequencies w = {cu\, oj 2 , ■ • ■ , uk}- The signal 
f = [fx f 2 ... /at] T is given by 

K 

fn = *ke j2 ™ kn , Cj k G [0, 1], B6{1,2 N}. (2) 

fe=i 

Such signals are sparse in the discrete-time Fourier transform 
(DTFT), when defined using an infinite dictionary. In prac- 
tice, a finite-length representation of the signal is required, 
and the transform of choice is the discrete Fourier trans- 
form (DFT). Unfortunately, the DFT coefficients for such a 
frequency-sparse signal are sparse only when the frequencies 
of the constituent sinusoids are integral. One way to remedy 
this problem would be to employ a dictionary corresponding 
to a finer discretization of the Fourier representation. We 
call such a dictionary a DFT frame of redundancy c G N, 
containing P = c ■ N elements, defined as: 



D = [d(wi) d(u 2 ) 



d(w P )] 



P 

up = p. 



d(ojp) — [di(wp) d 2 (u> p ) ... d N (uj p )] T , 



(3) 



where d n {u) = -^e j2 ™ n . However, the DFT frame vio- 
lates the incoherence requirement for the dictionary [5]. 

It has recently been shown in [6] that as far as the recovery 
of signal f (instead of the sparse coefficient vector x) is con- 
cerned, the coherence condition of the dictionary is not nec- 
essary, provided that the matrix D ff D is sufficiently sparse, 
where (-) H designates the Hermitian operation. In this case, 
the signal f can be recovered via l\-analysis. However, the 
matrix D H D is not sufficiently sparse for DFT frames. 

Alternatively, one can take advantage of structured spar- 
sity in spectral CS recovery by using a coherence inhibition 
model [5]. The resulting structured iterative hard threshold- 
ing (SIHT) algorithm can recover the frequency-sparse signal 
with a DFT frame by avoiding dictionary elements with high 
coherence. A variation of this method uses a band-exclusion 
function to achieve the same avoidance [8]. We can define the 
^-coherence band of the index set S as 

B V (S) ={J{i I fi{i, k) > 77}, i e {1,2,..., P}, (4) 

fees 

where /J,(i,k) = |(d(wj),d(wfc))| is the coherence between 
two atoms in the dictionary. The authors use the band- 
exclusion function to avoid selecting coherent dictionary ele- 
ments in various greedy algorithms, including Band-excluded 
Orthogonal Matching Pursuit (BOMP). 



More recently, it has been shown that one can recover a 
frequency-sparse signal from a random subset of its samples 
using atomic norm minimization [7]. The atomic norm of f 
is defined as the size of the smallest scaled convex hull of a 
continuous dictionary of complex exponentials. Thus, the re- 
covery procedure searches over a continuous dictionary rather 
than a discretized one. The atomic norm minimization can 
be implemented as a semidefinite program (SDP), which can 
be computationally expensive. In addition, this formulation 
does not account for measurement noise, and it is not clear if 
guarantees can be given for arbitrary measurement settings. 
Nonetheless, [7] motivates our formulation of algorithms that 
push past the discretization of the frequency parameter space. 

3. POLAR INTERPOLATION 
FOR FREQUENCY ESTIMATION 

One way to remedy the discretization of the frequency pa- 
rameter space implicit in CS is to use interpolation. In [12], a 
polar interpolation approach for translation-invariant signals 
has been derived. Such signals can be written as a linear com- 
bination of shifted versions of a waveform. In a nutshell, the 
interpolation procedure exploits the fact that translated ver- 
sions of a waveform form a manifold which lies on the surface 
of a hypersphere. Thus, any sufficiently small segment of the 
manifold can be well-approximated by an arc of a circle, and 
an arbitrarily-shifted waveform can be closely approximated 
by a point in such arc. 

The complex exponentials that compose a DFT frame also 
form a manifold over a hypersphere, and thus can be approx- 
imated by an arc of a circle. This is motivated by the fact 
that complex exponentials have translation-invariant Fourier 
transforms, which correspond to an isometric rotation of the 
time-domain vectors. In this case, the DFT frame samples 
the frequency parameter space with a steps size A = 1/c, 
and we approximate a segment of the manifold d(<Dj) : <Dj G 
\uj p — |,w p + y] by a circular arc containing the three ex- 
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ponentials {d(uj p — A), d(uj p ), d(uj p + y )}. Making use of 
trigonometric identities, the polar interpolator approximates 
exponentials d(wj), <Dj G [u p — -f-,w p + y], using linear 



combinations of the three exponentials [12]: 

d(u)j) w c(cjp) + r cos u ( w p) + r sin (~^® ) vU '. '• 
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where r is the £2 norm of each element of the dictionary and 
6 is the angle between d{uj p ) and d(uj p — In order to 
extend the above approximation to sums of J exponentials 
with frequencies £1 = {wi, uj 2i ■ ■ ■ , wj}, we define: 



f = C(fi)cr - U(fi)/3 - V(ft)7, 



(5) 



C(n)=[c(wi) c(lu 2 ) ■■■ c(uj)], 

U(fi) = [u(wi) u(w2) • • • u(wj)] , (6) 

V(n)=[v(wi) v(w2) ••• v(wj)], 

where a represents the amplitude of the signal and (3 and 7 
controls the frequency translations. The three coefficient vec- 
tors can be estimated using the following constrained convex 
optimization problem [12]: 

(a,/3, 7 )=T(y,A,ft) (7) 
1 , 
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dj > 0, 
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> for j = 1, . . . , J, 



o^r cos(#) < /3j < a^r, 



where A is the measurement matrix, and y is the received 
compressed signal. The constraints for the optimization prob- 
lem ensure that the solution consists of points on the arcs 
used for approximation. The first constraint ensures we have 
only nonnegative signal amplitudes. The second enforces the 
trigonometric relationship among each triplet ctj, (3j, and jj. 
The last constraint ensures that the angle between the solution 
and d(u>j) is restricted to the interval [0, 0\. It is necessary to 
scale (3 and 7 after the optimization problem [12]: 



(ft, 7,) 



(8) 



This is because the inequality of the second constraint should 
in fact be an equality. However, the equality would violate 
the convexity assumption of the optimization. After this nor- 
malization, we obtain the signal estimate from (6) and the 
frequency estimates using the one-to-one relation 

«jc(wj) + PMwj) + 7jv(wj) = Q.jd (uj + § tan" 1 ^)) 

(9) 

The optimization (7), when applied with all parameter values 
used in the dictionary D, is named continuous basis pursuit 
(CBP)in[12]: 



(a,/3,7) = T(y,A,JW), 



(10) 



where Qcbp = {^1,1^2, . . . , wp} is the set of all frequencies 
that appear in the DFT frame for our application of interest. 
As posed, CBP has a high computational complexity: it op- 
erates on matrices of size 3N, whereas other CS algorithms 
operate on matrices of size N. However, its interpolation step 
has one important advantage: translation-invariance and in- 
terpolation enables CBP to reconstruct arbitrary frequency- 
sparse signal while requiring only a small subset of the cor- 
responding dictionary. This makes it possible to incorporate 
the convex optimization solver into a greedy algorithm that 
quickly finds a rough estimate, which is then improved upon 
by a convex optimization solver. 



4. BAND-EXCLUDED 
INTERPOLATING SUBSPACE PURSUIT 

We incorporate the convex optimization (7) and band-exclusion 
(4) in a Subspace Pursuit algorithm [13]. We call this algo- 
rithm Band-Excluded Interpolating Subspace Pursuit (BISP), 
which is shown in Algorithm 1 . 

In the algorithm initialization, the best K correlating 
atoms are found and stored in S n by generating a proxy for 
the sparse signal. The K atoms are found iteratively, which 
deviates from the original Subspace Pursuit algorithm where 
the K atoms are found in one step. In each iteration, we trim 
the proxy based on the found atom and the band exclusion 
function B V (S), as defined in (4). In the main loop, we find 
the K best atom indices and add them to S n . From S n , we 
form a set fi consisting of all frequencies corresponding to 
the indices in S n along with all adjacent indices. This is 
necessary because the frequencies present in y may not be 
sufficiently incoherent and may therefore skew the peaks of 
the proxy estimate. Therefore, as a precaution, we include 
the closest neighbors on each side. The set ft is input to 
the convex optimization in (7) along with the measurement 
matrix and the received signal. 

In practice, we found that for noisy measurements it is of- 
ten preferable to move the minimization objective | |y — Af 1 1 2 
in (7) into a constraint. Moving this fidelity measure from 
the objective function to a constraint causes the optimization 
to return the sparsest set of coefficients that yields measure- 
ments within the noise range of the observation. If the output 
is non-existent or trivial, we move the fidelity metric from the 
objective function to the constraint (or vice versa). 

Algorithm 1 BISP 

INPUTS: Compressed signal y, sparsity K, measurement 
matrix A and spacing between dictionary elements A. 
OUTPUTS: Reconstructed signal f and frequency esti- 
mates u>. 

INITIALIZE: * = AD, i = 1, S° = 
while i < K do 

S° = S° U argmax, |(y, * g B {S°), 1 = 1 + 1 
end while 

y° = y- *s°*soy, n = 1 

LOOP: 

repeat 

i = l,S n = s™- 1 
while i < K do 

S n = S^Uargmax, |(y, i B Q (S n ), i = i + l 
end while 

a = (* S n)ty 
S n = supp(thresh(a, K)) 
n = U{A(s - 1), As, A(s + l)\s e S n } 
From T(y, A, 17) obtain f and u> using (9) and (6) 
y™ = y - Af , n = n + l 
until || y ; l || 2 > ly™- 1 ^ Vn < K 
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Fig. 1. Frequency estimation performance in noise-less case. 
The legend is shown in Fig. 2. 

5. NUMERICAL EXPERIMENTS 

To evaluate Algorithm 1, we have performed two numer- 
ical experiments. 1 We generated frequency-sparse signals 
of length N — 100 containing K = 4 complex sinusoids 
with frequencies selected uniformly at random. We used 
a DFT frame with c — 5 (A = 0.2Hz), and considered 
well-separated tones so that no two tones are closer than 
1Hz of each other. We performed Monte Carlo experiments 
and averaged over 30 experiments. As measurement ma- 
trix 2 we used a Gaussian matrix A 6 R Jl/xA '. We set 
M = kN, where k 6 (0, 1] is the CS subsampling rate. 
We compare our proposed Algorithm 1 with six state-of-the- 
art methods: t i-synthesis, ^-analysis, SIHT, SDP, BOMP, 
and CBP. As performance measure, we use the Hungarian 
algorithm [15, 16] to find the best matching between the es- 
timated and true frequencies. For the algorithms that return 
a dense DFT coefficient vector or a reconstructed signal 
synthesis, ^-analysis, SIHT, and SDP), we apply the MUSIC 
algorithm [17] on the reconstructed signal to estimate its fre- 
quencies. In the BISP and BOMP algorithms, we exclude 
atoms with coherence r\ > 0.25 using (4). 

For the first experiment, we explore a range of subsam- 
pling ratios k with noiseless measurements to verify the level 
of compression that allows for successful estimation. We set 
e = 10~ 10 for the relevant algorithms. The result of the nu- 
merical experiment is shown in Figure 1. In the noiseless 
case, SDP obtains the best result. The polar interpolation al- 
gorithms (CBP and BISP) both converge to a given estimation 
precision, which corresponds to the level of approximation 
error. When the number of measurements M is sufficiently 

'The documentation and code for these experiments are made freely 
available at http://www.sparsesampling.com/scspi, following 
the principle of Reproducible Research [14]. 

-For the SDP algorithm we used a random subsampling matrix, as the 
algorithm is only defined for such a measurement matrix. The authors would 
like to thank Gongguo Tang for providing the implementation of SDR 
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Fig. 2. Frequency estimation performance in noisy case. 





Noiseless 


Noisy 


^i-analysis 


9.5245 


8.8222 


^i-synthesis 


2.9082 


2.7340 


SIHT 


0.2628 


0.1499 


SDP 


8.2355 


9.9796 


BOMP 


0.0141 


0.0101 


CBP 


46.9645 


40.3477 


BISP 


5.4265 


1.4060 



Table 1. Average computation times in seconds. 

small, CBP outperforms £i-synthesis. The performance of 
BOMP and SIHT is worst among the algorithms tested. Sur- 
prisingly, while the DFT coefficients x found by £ i-synthesis 
are not sparse and do not match the original frequencies, the 
signal f is still reconstructed accurately, and so the MUSIC 
algorithm recovers the frequencies adequately. 

For the second experiment, we include measurement 
noise in the signal model. We fix n = 0.5 and vary the signal- 
to-noise ratio (SNR) from to 20 dB. In the noisy case, the 
polar interpolation algorithms perform best. This is because 
their interpolation step relies less on the sparsity of the signal 
and more on the known signal model and the fitting to a circle 
on the manifold. Additionally, the presence of noise renders 
the measurements non-sparse in the dictionaries used by the 
non-interpolating algorithms, hindering their performance. 

The computation time of the algorithms is also of impor- 
tance, and we have listed the average computation times in 
Table 1. We observed that most algorithms exhibit compu- 
tation time roughly independent of M, with the exception of 
£i -synthesis and CBP 3 . The table shows that the excellent per- 
formance of SDP in Figure 1 is tempered by its high computa- 
tional complexity, as well as its lack of flexibility on the mea- 
surement scheme. Moreover, the relaxation in BISP that ac- 
counts for the presence of noise reduces its computation time, 
increasing its performance advantage over SDP and CBP. 

3 See results at http : //www . sparsesampling . com/scspi. 
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